Learning Context Sensitive Languages with LSTM Trained with Kalman Filters
Authors
Abstract
Unlike traditional recurrent neural networks, the Long Short-Term Memory (LSTM) model generalizes well when presented with training sequences derived from regular and also simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n ≤ 10) of the context-sensitive language a^n b^n c^n to deal correctly with values of n up to 1000 and beyond. Even taking into account the relatively high update complexity per time step, in many cases the hybrid learns faster than LSTM alone.
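As a concrete illustration of the benchmark, the sketch below (Python, written for this summary; the one-hot encoding, the start/end markers, and all names are assumptions, not the authors' exact setup) generates the 10 shortest exemplars of a^n b^n c^n as a next-symbol prediction task, plus a much longer string for the generalization test. Note that the next symbol is not always uniquely determined (after a run of a's, either another a or the first b may legally follow), so benchmarks of this kind typically accept either legal continuation.

import numpy as np

SYMBOLS = ['S', 'a', 'b', 'c', 'T']          # start marker, alphabet, end marker (assumed encoding)
IDX = {s: i for i, s in enumerate(SYMBOLS)}

def one_hot(symbol):
    v = np.zeros(len(SYMBOLS))
    v[IDX[symbol]] = 1.0
    return v

def anbncn_string(n):
    # The n-th exemplar of the context-sensitive language a^n b^n c^n, with start/end markers.
    return ['S'] + ['a'] * n + ['b'] * n + ['c'] * n + ['T']

def prediction_pairs(n):
    # Frame the string as next-symbol prediction: input at time t, target = symbol at time t+1.
    s = anbncn_string(n)
    inputs = [one_hot(x) for x in s[:-1]]
    targets = [one_hot(x) for x in s[1:]]
    return inputs, targets

# Training set: only the 10 shortest exemplars (n <= 10), as stated in the abstract.
training_set = [prediction_pairs(n) for n in range(1, 11)]

# Generalization test: far longer strings, e.g. n = 1000.
test_inputs, test_targets = prediction_pairs(1000)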
Similar resources
Improving Long-Term Online Prediction with Decoupled Extended Kalman Filters
Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) outperform traditional RNNs when dealing with sequences that involve not only short-term but also long-term dependencies. The decoupled extended Kalman filter learning algorithm (DEKF) works well in online environments and significantly reduces the number of training steps compared with standard gradient-descent algorithms. Prev...
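For orientation, these are the decoupled extended Kalman filter weight updates in their common formulation for recurrent-network training (a sketch of the standard equations, not necessarily the exact variant used in the work above). The weights of each group i are treated as the state of a noisy dynamical system and, at every time step t,

K_t^i = P_t^i H_t^i \left[ R_t + \sum_j (H_t^j)^\top P_t^j H_t^j \right]^{-1},
w_{t+1}^i = w_t^i + K_t^i \xi_t,
P_{t+1}^i = P_t^i - K_t^i (H_t^i)^\top P_t^i + Q_t^i,

where \xi_t is the output error, H_t^i holds the derivatives of the network outputs with respect to the weights of group i, P_t^i is that group's error covariance, R_t the measurement-noise covariance, and Q_t^i the artificial process noise. The "decoupling" consists in keeping the covariance block-diagonal over weight groups, which is what keeps the per-step cost manageable compared with a single global filter.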
LSTM recurrent networks learn simple context-free and context-sensitive languages
Previous work on learning regular languages from exemplary training sequences showed that long short-term memory (LSTM) outperforms traditional recurrent neural networks (RNNs). We demonstrate LSTM's superior performance on context-free language benchmarks for RNNs, and show that it works even better than previous hardwired or highly specialized architectures. To the best of our knowledge, LSTM ...
Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets
The long short-term memory (LSTM) network trained by gradient descent solves difficult problems that traditional recurrent neural networks in general cannot. We have recently observed that the decoupled extended Kalman filter training algorithm allows for even better performance, significantly reducing the number of training steps when compared to the original gradient descent training algorit...
Learning Nonregular Languages: A Comparison of Simple Recurrent Networks and LSTM
In response to Rodriguez's recent article (2001), we compare the performance of simple recurrent nets and long short-term memory recurrent nets on context-free and context-sensitive languages.
On learning context-free and context-sensitive languages
The long short-term memory (LSTM) network is not the only neural network that learns a context-sensitive language. Second-order sequential cascaded networks (SCNs) are able to induce, from a finite fragment of a context-sensitive language, a means of processing strings outside the training set. The dynamical behavior of the SCN is qualitatively distinct from that observed in LSTM networks. Differences in...
Journal:
Volume/Issue:
Pages: -
Publication date: 2002